[BUG] Incorrect expression naming for struct get #1832

kevinzwang · 2024-02-02T00:14:31Z

The reason why select from #1810 wasn't working was because there was a discrepancy between the name element in the Field struct of the struct get expression and the name() method of the expression.

This PR is a quick fix to that behavior for that specific expression, but ideally we would want to consolidate the logic between Expr::to_field and Expr::name so that there aren't two code paths for getting the name of an expression. The problem behind that is Expr::to_field takes in a schema, which is not always easily obtainable during the calls we make in the code to Expr::name. Notably, Python code in the physical plan and execution step generally do not use the schema, but do use expression names. Thus always requiring a schema for Expr::name would not work.

This raises the question: would a schema ever affect the name of an expression? The conclusion from a conversation with @clarkzinzow and @jaychia was no. Thus, would it be reasonable to separate the name from an expression's field, so that Expr::name was the single source of truth?

Moreover, in the future we would like to do table-scoped columns so that we are able to construct expressions that take data from multiple tables. Once we do that, it should be possible to derive the dtype of an expression by determining the dtypes of the columns or literals which are leaves in the expression tree. Then, we wouldn't even need to pass around a schema altogether?

Anyway, lots to think about. We'll leave this as tech debt that should be reworked when implement table-scoped columns, since that will likely require rethinking a lot about schemas and expressions as well.

codecov · 2024-02-02T00:24:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (caa7bab) 85.62% compared to head (7e4578b) 85.54%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1832      +/-   ##
==========================================
- Coverage   85.62%   85.54%   -0.09%     
==========================================
  Files          55       55              
  Lines        6102     6102              
==========================================
- Hits         5225     5220       -5     
- Misses        877      882       +5

see 4 files with indirect coverage changes

samster25

LGTM!

samster25 · 2024-02-05T19:13:39Z

@kevinzwang can you also make a ticket noting the tech debt and what the suggested solution is?

fix Expr::name for struct get

58bb051

kevinzwang requested review from samster25 and jaychia February 2, 2024 00:14

kevinzwang linked an issue Feb 2, 2024 that may be closed by this pull request

[BUG] Cannot select 2 fields at once from a df using struct accessor #1810

Closed

github-actions bot added the bug Something isn't working label Feb 2, 2024

kevinzwang marked this pull request as ready for review February 2, 2024 00:15

jaychia approved these changes Feb 2, 2024

View reviewed changes

samster25 approved these changes Feb 5, 2024

View reviewed changes

Merge remote-tracking branch 'origin' into kevin/struct-getter-name

7e4578b

kevinzwang mentioned this pull request Feb 7, 2024

Expression name and field consistency #1848

Open

kevinzwang merged commit fbc026a into main Feb 7, 2024
42 checks passed

kevinzwang deleted the kevin/struct-getter-name branch February 7, 2024 09:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Incorrect expression naming for struct get #1832

[BUG] Incorrect expression naming for struct get #1832

kevinzwang commented Feb 2, 2024 •

edited

Loading

codecov bot commented Feb 2, 2024 •

edited

Loading

samster25 left a comment

samster25 commented Feb 5, 2024

[BUG] Incorrect expression naming for struct get #1832

[BUG] Incorrect expression naming for struct get #1832

Conversation

kevinzwang commented Feb 2, 2024 • edited Loading

codecov bot commented Feb 2, 2024 • edited Loading

Codecov Report

samster25 left a comment

Choose a reason for hiding this comment

samster25 commented Feb 5, 2024

kevinzwang commented Feb 2, 2024 •

edited

Loading

codecov bot commented Feb 2, 2024 •

edited

Loading